
Py2f parallel #457

Open · wants to merge 126 commits into main
Conversation

abishekg7 (Contributor):

No description provided.

Base automatically changed from py2f-with-optimisations to main May 29, 2024 13:39
from icon4py.model.common.settings import device


#try:
Contributor:

This removes ghex and mpi4py being optional dependencies, i.e. that you can run icon4py even if you don't have those libraries installed. We should keep that feature, imho.
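A minimal sketch of how such an optional import is usually guarded, so single-node runs keep working without the parallel stack (the HAS_MPI flag name is an assumption, not part of the PR):

try:
    import ghex
    import mpi4py

    HAS_MPI = True  # parallel features available
except ImportError:
    # ghex and mpi4py stay optional: icon4py must still run without them
    ghex = None
    mpi4py = None
    HAS_MPI = False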

@@ -20,6 +20,13 @@
EdgeDim = Dimension("Edge")
CellDim = Dimension("Cell")
VertexDim = Dimension("Vertex")
SingletonDim = Dimension("Singleton")
Contributor:

What do you need these for? Could you move them to py2fgen? Anything that only concerns the interfacing to Fortran should go to tools/py2fgen and not bloat up the model.


vertex_end_halo = self.grid.get_end_index(VertexDim, HorizontalMarkerIndex.halo(VertexDim))

loc_rank = self._exchange.my_rank()
Contributor:

Either delete the log statements, or keep them but not commented out. You can switch them off globally via the log level.
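For reference, a sketch of the global switch-off using the standard logging module (the logger name is illustrative):

import logging

# raise the threshold for the whole package; DEBUG/INFO calls become cheap no-ops
logging.getLogger("icon4py").setLevel(logging.WARNING)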



@dataclasses.dataclass
class CachedProgram:
Contributor:

Isn't this a duplicate of common/caching.py? It has nothing in particular to do with diffusion, so it should not be here...

@@ -163,13 +163,6 @@ def end(cls, dim: Dimension) -> int:
return cls._end[dim]


@dataclass(frozen=True)
Contributor:

thanks for cleaning up...

)
#processor_props = get_multinode_properties(MultiNodeRun(), comm_id)
#exchange = definitions.create_exchange(processor_props, decomposition_info)
Contributor:

Delete commented-out code, or uncomment it if it remains useful. You could also extract these statements into a function so that they pollute the code here less. You can always control whether they are executed through the log level: if you run with anything above DEBUG, they will not be triggered.

):
logger.info(f"Using Device = {device}")
log.info(f"Using Device = {device}")

# ICON grid
if device.name == "GPU":
Contributor:

You could shorten this to:

Suggested change
-if device.name == "GPU":
+on_gpu = (device.name == "GPU")

or at least use an if-expression if you think that is cryptic:

on_gpu = True if device.name == "GPU" else False

cells_end_index,
vertex_start_index,
vertex_end_index,
edge_start_index,
Contributor:

Add type annotations.
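For illustration, a hedged sketch of what the annotated parameters could look like, assuming the indices arrive as integer arrays from Fortran (the function name and dtypes are assumptions):

import numpy as np

def construct_grid_from_indices(
    cells_start_index: np.ndarray,  # assumed int32, one entry per refinement zone
    cells_end_index: np.ndarray,
    vertex_start_index: np.ndarray,
    vertex_end_index: np.ndarray,
    edge_start_index: np.ndarray,
    edge_end_index: np.ndarray,
) -> None:
    ...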

@@ -77,6 +97,88 @@ def load_grid_from_file(
return gm.get_grid()


def construct_icon_grid(
Contributor:

If you only use this interface (passing all arrays to one function) for the grid construction in the wrapper, maybe you should also move it there? For the Python code I think it's nicer to use the builder pattern directly.

In any case, I think this should not be in test_utils. For practical reasons test_utils lives in the production code, but as the name says it is meant for testing, so I don't know whether you want the Fortran production code relying on test infrastructure.
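A hypothetical sketch of the builder-style construction the comment alludes to; the method names (with_config, with_start_end_indices, with_connectivities) are assumptions about the IconGrid API:

grid = (
    IconGrid()
    .with_config(grid_config)
    .with_start_end_indices(CellDim, cells_start_index, cells_end_index)
    .with_connectivities({C2EDim: c2e, E2CDim: e2c, V2EDim: v2e})
)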

@@ -370,6 +370,9 @@ def __init__(self, exchange: ExchangeRuntime = SingleNodeExchange()):
self.cell_params: Optional[CellParams] = None
self._horizontal_start_index_w_diffusion: int32 = 0

def set_exchange(self, exchange):
Contributor:

Do you really need to set this and make it mutable?

In general, we should discuss whether there is a way for your granule interfacing to work while keeping the distinction between init and __init__ in the Python granule.

@@ -297,7 +297,7 @@ def __post_init__(self, config):
object.__setattr__(
self,
"scaled_nudge_max_coeff",
config.nudge_max_coeff * DEFAULT_PHYSICS_DYNAMICS_TIMESTEP_RATIO,
Contributor:

Can you add a comment here saying that ICON already scales this by 5, and that it is therefore the responsibility of the user to set nudge_max_coeff accordingly?
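Something along these lines, for example (the exact wording is only a suggestion):

# Note: ICON already scales nudge_max_coeff by the physics/dynamics timestep
# ratio (5), so it is the user's responsibility to set nudge_max_coeff accordingly.
config.nudge_max_coeff * DEFAULT_PHYSICS_DYNAMICS_TIMESTEP_RATIO,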



def fortran_grid_indices_to_numpy_offset(inp) -> np.ndarray:
#return np.subtract(xp.asnumpy(inp.ndarray, order="F").copy(order="F"), 1)
Contributor:

Remove dead code



def fortran_grid_indices_to_numpy(inp) -> np.ndarray:
#return xp.asnumpy(inp.ndarray, order="F").copy(order="F")
Contributor:

Remove dead code

Comment on lines 182 to 202
cells_start_index_np = fortran_grid_indices_to_numpy_offset(cells_start_index)
vert_start_index_np = fortran_grid_indices_to_numpy_offset(vert_start_index)
edge_start_index_np = fortran_grid_indices_to_numpy_offset(edge_start_index)

cells_end_index_np = fortran_grid_indices_to_numpy(cells_end_index)
vert_end_index_np = fortran_grid_indices_to_numpy(vert_end_index)
edge_end_index_np = fortran_grid_indices_to_numpy(edge_end_index)

c_glb_index_np = fortran_grid_indices_to_numpy_offset(c_glb_index)
e_glb_index_np = fortran_grid_indices_to_numpy_offset(e_glb_index)
v_glb_index_np = fortran_grid_indices_to_numpy_offset(v_glb_index)

c_owner_mask_np = c_owner_mask.ndarray.copy(order="F")[0:num_cells]
e_owner_mask_np = e_owner_mask.ndarray.copy(order="F")[0:num_edges]
v_owner_mask_np = v_owner_mask.ndarray.copy(order="F")[0:num_verts]

c2e_loc = fortran_grid_connectivities_to_xp_offset(c2e)
c2e2c_loc = fortran_grid_connectivities_to_xp_offset(c2e2c)
v2e_loc = fortran_grid_connectivities_to_xp_offset(v2e)
e2c2v_loc = fortran_grid_connectivities_to_xp_offset(e2c2v)
e2c_loc = fortran_grid_connectivities_to_xp_offset(e2c)
Contributor:

As discussed, it would be cleaner to delegate this to a more generic function, or to call it in a loop, so you don't have multiple statements calling the same function over and over again.
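A sketch of the loop-based variant (the dict grouping and result names are assumptions; the converter functions are the ones from the snippet above):

start_and_global = {
    "cells_start": cells_start_index,
    "vert_start": vert_start_index,
    "edge_start": edge_start_index,
    "c_glb": c_glb_index,
    "e_glb": e_glb_index,
    "v_glb": v_glb_index,
}
offset_indices = {
    name: fortran_grid_indices_to_numpy_offset(arr) for name, arr in start_and_global.items()
}

connectivities = {
    name: fortran_grid_connectivities_to_xp_offset(conn)
    for name, conn in {"c2e": c2e, "c2e2c": c2e2c, "v2e": v2e, "e2c2v": e2c2v, "e2c": e2c}.items()
}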

vert_start_index_np = fortran_grid_indices_to_numpy_offset(vert_start_index)
edge_start_index_np = fortran_grid_indices_to_numpy_offset(edge_start_index)

cells_end_index_np = fortran_grid_indices_to_numpy(cells_end_index)
Contributor:

Please use clearer naming for these functions.

on_gpu=on_gpu,
limited_area=True if limited_area else False,
decomposition_info = (
DecompositionInfo(klevels=num_levels)
Contributor:

I would try to avoid array variable names with np suffixes unless strictly necessary. It would be nicer to keep the code as generic as possible (supporting both cupy and numpy arrays). If conversion is necessary somewhere, do it where it is needed.
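A sketch of the device-agnostic style the comment suggests, assuming xp resolves to numpy on CPU and cupy on GPU as elsewhere in the code:

def fortran_grid_indices_to_offset(inp):
    # stays generic: no forced host copy; the result lives wherever the input lives
    return xp.asarray(inp.ndarray) - 1

# convert with xp.asnumpy(...) only at the call sites that strictly need host arrays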

processor_props = get_multinode_properties(MultiNodeRun(), comm_id)
exchange = definitions.create_exchange(processor_props, decomposition_info)

# log.debug("icon_grid:cell_start%s", icon_grid.start_indices[CellDim])
Contributor:

Remove dead code, or, if these debug statements are useful, I would pack them into a separate function.

Contributor:

Also, if you decide on the separate function, see if you can make the debug statements less repetitive, e.g. by using for loops.
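A sketch of such a helper; the calls rely on the log level, so with anything above DEBUG they are cheap no-ops (the field selection is illustrative):

def log_grid_debug_info(icon_grid) -> None:
    for dim in (CellDim, EdgeDim, VertexDim):
        log.debug("icon_grid start indices for %s: %s", dim, icon_grid.start_indices[dim])
        log.debug("icon_grid end indices for %s: %s", dim, icon_grid.end_indices[dim])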

@@ -223,6 +360,9 @@ def diffusion_init(
geofac_grg_y=geofac_grg_y,
nudgecoeff_e=nudgecoeff_e,
)
global diffusion_granule
samkellerhals (Contributor) commented on Jun 18, 2024:

Since using globals is bad practice in Python, please put a comment here explaining why we are doing this.
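For example (the wording and the assignment target are suggestions only):

# A module-level global is needed here because the Fortran side calls
# diffusion_init and diffusion_run as free functions and cannot hold a Python
# object handle between the calls; the granule instance must survive in module state.
global diffusion_granule
diffusion_granule = diffusion  # assumption: the instance constructed above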


Mandatory Tests

Please make sure you run these tests via comment before you merge!

  • cscs-ci run default
  • launch jenkins spack

Optional Tests

To run benchmarks you can use:

  • cscs-ci run benchmark

To run tests and benchmarks with the DaCe backend you can use:

  • cscs-ci run dace

In case your change might affect downstream icon-exclaim, please consider running

  • launch jenkins icon

For more detailed information please look at CI in the EXCLAIM universe.

@@ -50,6 +52,7 @@
if TYPE_CHECKING:
import mpi4py.MPI

ghex_arch = Architecture.GPU if device.name == "GPU" else Architecture.CPU
Contributor:

settings.device is an enum. Why not test directly for the enum value instead of using device.name? If you want to compare with a string, you could turn settings.py::Device into a string enum:

class Device(str, Enum):
    ...

or

class Device(StrEnum):
    ...
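With either variant the call site can then compare enum members directly (note StrEnum requires Python 3.11+; the GPU member name is taken from the surrounding code):

on_gpu = settings.device == Device.GPU
ghex_arch = Architecture.GPU if on_gpu else Architecture.CPU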

prognostic_state.w,
prognostic_state.theta_v,
prognostic_state.exner,
prognostic_state.w.ndarray[0 : self.grid.num_cells, :],
Contributor:

Did you try to get rid of these slices and it did not work? Do you know why, or what did not work? We should try to figure this out and handle it differently; it is very error-prone like this.

Contributor:

If the explicit bounds are needed, we could try to push them inside GHexMultiNodeExchange.exchange. What you use here is the "real" num_cells, not nproma, right?
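A hypothetical sketch of pushing the bounds into the exchange; everything beyond the method name exchange is an assumption about GHexMultiNodeExchange internals:

class GHexMultiNodeExchange:
    def exchange(self, dim, *fields):
        # slice to the grid's real size (num_cells, not nproma) internally,
        # so call sites can pass the full prognostic fields unsliced
        n = self._grid_size[dim]  # assumption: per-dimension owned sizes are known here
        views = [f.ndarray[0:n, :] for f in fields]
        return self._do_exchange(dim, views)  # assumption: existing internal entry point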
